skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Xu, Yuanchao"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available July 20, 2026
  2. Recent years have witnessed increasing interest in machine learning (ML) inferences on serverless computing due to its auto-scaling and cost-effective properties. However, one critical aspect, function granularity, has been largely overlooked, limiting the potential of serverless ML. This paper explores the impact of function granularity on serverless ML, revealing its important effects on the SLO hit rates and resource costs of serverless applications. It further proposes adaptive granularity as an approach to addressing the phenomenon that no single granularity fits all applications and situations. It explores three predictive models and presents programming tools and runtime extensions to facilitate the integration of adaptive granularity into existing serverless platforms. Experiments show adaptive granularity produces up to a 29.2% improvement in SLO hit rates and up to a 24.6% reduction in resource costs over the state-of-the-art serverless ML which uses fixed granularity. 
    more » « less
    Free, publicly-accessible full text available March 6, 2026
  3. Shared memory system-on-chips (SM-SoCs) are ubiquitously employed by a wide range of computing platforms, including edge/IoT devices, autonomous systems, and smartphones. In SM-SoCs, system-wide shared memory enables a convenient and cost-effective mechanism for making data accessible across dozens of processing units (PUs), such as CPU cores and domain-specific accelerators. Due to the diverse computational characteristics of the PUs they embed, SM-SoCs often do not employ a shared last-level cache (LLC). Although covert channel attacks have been widely studied in shared memory systems, high-throughput communication has previously been feasible only by relying on an LLC or by possessing privileged or physical access to the shared memory subsystem. In this study, we introduce a new memory-contention-based covert communication attack, MC3, which specifically targets shared system memory in mobile SoCs. Unlike existing attacks, our approach achieves high-throughput communication without the need for an LLC or elevated access to the system. We explore the effectiveness of our methodology by demonstrating the trade-off between the channel transmission rate and the robustness of the communication. We evaluate MC3 on NVIDIA Orin AGX, NX, and Nano platforms and achieve transmission rates up to 6.4 Kbps with less than 1% error rate. 
    more » « less
    Free, publicly-accessible full text available March 31, 2026
  4. Shared memory system-on-chips (SM-SoCs) are ubiquitously employed by a wide range of computing platforms, including edge/IoT devices, autonomous systems, and smartphones. In SM-SoCs, system-wide shared memory enables a convenient and cost-effective mechanism for making data accessible across dozens of processing units (PUs), such as CPU cores and domain-specific accelerators. Due to the diverse computational characteristics of the PUs they embed, SM-SoCs often do not employ a shared last-level cache (LLC). Although covert channel attacks have been widely studied in shared memory systems, high-throughput communication has previously been feasible only by relying on an LLC or by possessing privileged or physical access to the shared memory subsystem. In this study, we introduce a new memory-contention-based covert communication attack, MC3, which specifically targets shared system memory in mobile SoCs. Unlike existing attacks, our approach achieves high-throughput communication without the need for an LLC or elevated access to the system. We explore the effectiveness of our methodology by demonstrating the trade-off between the channel transmission rate and the robustness of the communication. We evaluate MC3 on NVIDIA Orin AGX, NX, and Nano platforms and achieve transmission rates up to 6.4 Kbps with less than 1% error rate. 
    more » « less
    Free, publicly-accessible full text available March 31, 2026
  5. Disaggregated memory systems achieve resource utilization efficiency and system scalability by distributing computation and memory resources into distinct pools of nodes. RDMA is an attractive solution to support high-throughput communication between different disaggregated resource pools. However, existing RDMA solutions face a dilemma: one-sided RDMA completely bypasses computation at memory nodes, but its communication takes multiple round trips; two-sided RDMA achieves one-round-trip communication but requires non-trivial computation for index lookups at memory nodes, which violates the principle of disaggregated memory. This work presents Outback, a novel indexing solution for key-value stores with a one-round-trip RDMA-based network that does not incur computation-heavy tasks at memory nodes. Outback is the first to utilize dynamic minimal perfect hashing and separates its index into two components: one memory-efficient and compute-heavy component at compute nodes and the other memory-heavy and compute-efficient component at memory nodes. We implement a prototype of Outback and evaluate its performance in a public cloud. The experimental results show that Outback achieves higher throughput than both the state-of-the-art one-sided RDMA and two-sided RDMA-based in-memory KVS by 1.06--5.03×, due to the unique strength of applying a separated perfect hashing index. 
    more » « less